Abstract

The goal of this paper is the development of a novel approach to the problem of noise removal, based on the theory of Reproducing Kernel Hilbert Spaces (RKHS). The problem is cast as an optimization task in an RKHS, by taking advantage of the celebrated semi-parametric Representer Theorem. Examples verify that in the presence of Gaussian noise the proposed method performs relatively well compared to wavelet-based techniques and outperforms them significantly in the presence of impulse or mixed noise.

1 Introduction 

The problem of noise removal from a digitized image is one of the most fundamental ones in digital image processing. So far, various techniques have been proposed to deal with it. Among the most important methodologies are, for example, the wavelet-based image denoising methods, which have dominated the research in recent years [?]. In this paper we propose a novel approach which (to our knowledge) has not been considered before. We employ the well-known and powerful tool of kernels.

In kernel methodology the notion of the Reproducing Kernel Hilbert Space (RKHS) plays a crucial role. An RKHS is a rich construct (roughly, a smooth space with an inner product), which has been proven to be a very powerful tool for nonlinear processing [?]. In the denoising problem, we exploit a useful property of RKHS, the representer theorem [9]. It states that the minimizer of any optimization task in $\mathcal{H}$, with a cost function of a certain type, has a finite representation in $\mathcal{H}$. We recast the image denoising problem as an optimization task of this type and use the semi-parametric version of the representer theorem. The latter allows for explicit modeling of the edges in an image. In this way we can deal with the smoothness which is implicitly imposed by the "smooth" nature of RKHS.

Though there has been some work exploring the use of kernels in the denoising problem, the methodology presented here is fundamentally different. In [10], the notion of kernel regression has been adopted. The original image is formulated as a Taylor approximation series around a center, $x_i$, and data-adaptive kernels are used as weighting factors to penalize distances away from $x_i$. In a relatively similar context, kernels have been employed by other well-known denoising methods (such as [?]). Kernels were also used in the context of RKHS in [5]. However, the obtained results were not satisfactory, especially around edges. It is exactly this drawback that is addressed by our method.

2 Mathematical Preliminaries 

We start with some basic definitions regarding RKHS. Let $X$ be a non-empty set with $x_1, \dots, x_n \in X$. Consider a Hilbert space $\mathcal{H}$ of real-valued functions $f$ defined on $X$, with a corresponding inner product $\langle \cdot, \cdot \rangle_{\mathcal{H}}$. We call $\mathcal{H}$ a Reproducing Kernel Hilbert Space (RKHS) if there exists a function, known as the kernel, $\kappa : X \times X \to \mathbb{R}$, with the following two properties:

1. For every $x \in X$, $\kappa(x, \cdot)$ belongs to $\mathcal{H}$.

2. $\kappa$ has the so-called reproducing property, i.e. $f(x) = \langle f, \kappa(x, \cdot) \rangle_{\mathcal{H}}$ for all $f \in \mathcal{H}$. In particular, $\kappa(x, y) = \langle \kappa(x, \cdot), \kappa(y, \cdot) \rangle_{\mathcal{H}}$.

It can be shown that the kernel $\kappa$ generates the entire space $\mathcal{H}$, i.e. $\mathcal{H} = \overline{\operatorname{span}}\{\kappa(x, \cdot) \mid x \in X\}$. There are several kernels that are used in practice (see [9]). In this work, we focus on one of the most widely used, the Gaussian kernel:

$$\kappa(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right), \quad \sigma > 0,$$

due to some additional properties that it admits.
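As a minimal sketch of the definition above, the Gaussian kernel and the Gram matrix it induces on a few sample points can be written in plain Python. The names `gaussian_kernel` and `gram_matrix`, the sample points and the value of `sigma` are illustrative assumptions and are not taken from the authors' implementation.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """kappa(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for equal-length sequences."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, sigma=1.0):
    """Matrix of kernel values kappa(x_i, x_j) for all pairs of points."""
    return [[gaussian_kernel(p, q, sigma) for q in points] for p in points]

# Hypothetical 2-D sample points, chosen only for illustration.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 1.0)]
K = gram_matrix(points, sigma=0.5)
```

The diagonal of `K` equals 1, the matrix is symmetric, and the entries decay as the points move apart, which is the locality that makes the induced space "smooth".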

One of the many powerful tools in kernel theory is the application of the semi-parametric representer theorem to regularized risk minimization problems (see [9]):

Theorem 2.1. Denote by $\mu_1, \mu_2 : [0, \infty) \to \mathbb{R}$ two strictly monotonically increasing functions, by $X$ a set and by $c : (X \times \mathbb{R}^2)^N \to \mathbb{R} \cup \{\infty\}$ an arbitrary loss function. Furthermore, consider a set of $M$ real-valued functions $\{\psi_k\}_{k=1}^{M}$ with the property that the $N \times M$ matrix $(\psi_p(x_n))_{n,p}$ has rank $M$. Then any $\tilde{f} := f + h$, with $f \in \mathcal{H}$ and $h \in \operatorname{span}\{\psi_k\}$, minimizing the regularized risk functional

$$c\big((x_1, z_1, \tilde{f}(x_1)), \dots, (x_N, z_N, \tilde{f}(x_N))\big) + \mu_1(\|f\|_{\mathcal{H}}) + \mu_2(\|h\|)$$

admits a representation of the form

$$\tilde{f}(x) = \sum_{n=1}^{N} \alpha_n \kappa(x_n, x) + \sum_{k=1}^{M} \beta_k \psi_k(x). \qquad (1)$$
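To illustrate what a representation of this kind means computationally, the following sketch evaluates a kernel expansion plus a parametric part at a single point. It is only an illustration under stated assumptions: the function name `semiparametric_eval`, the one-dimensional centers, the coefficients and the two parametric functions are all made up for the example.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel on equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def semiparametric_eval(x, centers, alphas, psis, betas, sigma):
    """Evaluate sum_n alpha_n kappa(x_n, x) + sum_k beta_k psi_k(x)."""
    kernel_part = sum(a * gaussian_kernel(c, x, sigma)
                      for a, c in zip(alphas, centers))
    parametric_part = sum(b * psi(x) for b, psi in zip(betas, psis))
    return kernel_part + parametric_part

# Hypothetical toy data: two kernel centers and a parametric span
# made of a constant and a steep, step-like error function.
centers = [(0.0,), (1.0,)]
alphas = [0.3, -0.2]
psis = [lambda x: 1.0, lambda x: math.erf(10.0 * (x[0] - 0.5))]
betas = [2.0, 0.5]

value = semiparametric_eval((0.0,), centers, alphas, psis, betas, sigma=1.0)
```

The kernel part can only produce smooth variations, while the parametric part is free to carry sharp transitions; the number of unknowns is finite ($N + M$ coefficients) even though the search space is infinite-dimensional.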



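Usually the regularization term $\mu_1(\|f\|_{\mathcal{H}})$ takes the form $\mu_1(\|f\|_{\mathcal{H}}) = \frac{1}{2}\|f\|^2_{\mathcal{H}}$. In the case of the RKHS produced by the Gaussian kernel we can prove that

$$\|f\|^2_{\mathcal{H}} = \int \sum_{n=0}^{\infty} \frac{\sigma^{2n}}{n!\, 2^n} \left(O^n f(x)\right)^2 dx, \qquad (2)$$

with $O^{2n} = \Delta^n$ and $O^{2n+1} = \nabla \Delta^n$, $\Delta$ being the Laplacian and $\nabla$ the gradient operator (see [9]). Thus, we see that the regularization term "penalizes" the derivatives of the minimizer. This results in a very smooth solution of the regularized risk minimization problem.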
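To make the penalty concrete, the first few terms of the norm can be written out. This assumes the expansion of the Gaussian-kernel norm reported in [9], namely $\|f\|^2_{\mathcal{H}} = \int \sum_{n} \frac{\sigma^{2n}}{n!\,2^n} (O^n f(x))^2 dx$ with $O^{2n} = \Delta^n$ and $O^{2n+1} = \nabla\Delta^n$; the coefficients below are simply $\sigma^{2n}/(n!\,2^n)$ for $n = 0, 1, 2, 3$.

```latex
\|f\|^2_{\mathcal{H}}
  = \int f(x)^2 \, dx
  + \frac{\sigma^2}{2} \int \|\nabla f(x)\|^2 \, dx
  + \frac{\sigma^4}{8} \int \big(\Delta f(x)\big)^2 \, dx
  + \frac{\sigma^6}{48} \int \|\nabla \Delta f(x)\|^2 \, dx
  + \cdots
```

Derivatives of every order contribute, with weights that grow with the kernel width $\sigma$, so a sharp edge (whose high-order derivatives are large) is expensive to represent in the RKHS part alone.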

Note that according to Theorem 2.1 the model of a function has two parts, one lying in the smooth RKHS and another part $h$, which gives rise to the second term in the expansion (1). It is exactly this term that is exploited by our method in order to explicitly model edges.

3 Application to the denoising problem 

Let $f$ be the original image and $\hat{f}$ the noisy one (we consider them as continuous functions). Also, let $f_{ij}$ and $\hat{f}_{ij}$ be the restrictions of $f$ and $\hat{f}$ to the $N \times N$ square region centered at the pixel $(i, j)$ of each image, respectively ($N$ is an odd number). Our task is to find $f_{ij}$ from the given samples of $\hat{f}_{ij}$. For simplicity, we drop the $i, j$ indices and consider $f_{ij}$ and $\hat{f}_{ij}$ (which from now on will be written as $f$ and $\hat{f}$) as functions defined on $[0, 1]^2$ (and zero elsewhere). The pixel values of the digitized image are given by $f(x_n, y_m)$ and $\hat{f}(x_n, y_m)$, where $x_n = n/(N-1)$, $y_m = m/(N-1)$ for $n, m = 0, 1, \dots, N-1$.

Figure 1. Two of the functions $\psi_k$ that are used to represent edges.

We consider a set of real-valued functions $\{\psi_k, k = 1, \dots, K\}$ of two variables suitable to represent edges; i.e., bivariate polynomials (which are controlled by the coefficients $h_0, h_1, h_2, h_3$) and functions of the form $\operatorname{Erf}(a \cdot x + b \cdot y + c)$, where Erf is the error function,

$$\operatorname{Erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt,$$

for several suitable choices of $a$, $b$ and $c$ (see Figure 1). Thus we formulate the regularized risk minimization problem as follows:

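$$\text{minimize} \quad \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} \Big| f(x_n, y_m) + h_0 + h_1 x_n + h_2 y_m + h_3 x_n y_m + \sum_{k=1}^{K} \beta_k \psi_k(x_n, y_m) - \hat{f}(x_n, y_m) \Big| + \frac{\lambda}{2} \|f\|^2_{\mathcal{H}} + \frac{\mu}{2} \sum_{k=1}^{K} |\beta_k|^2 + \frac{\mu_1}{2} \sum_{l=1}^{3} |h_l|^2. \qquad (3)$$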
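The parametric part of the model, i.e. the bilinear polynomial with coefficients $h_0, \dots, h_3$ plus the Erf edge functions, can be sketched on the pixel grid $x_n = n/(N-1)$ as follows. The helper names, the window size and the particular values of $a$, $b$, $c$, $h$ and $\beta$ are assumptions chosen for illustration; the paper does not list the values it uses.

```python
import math

def grid(N):
    """Pixel coordinates n/(N-1), n = 0..N-1, on [0, 1]."""
    return [n / (N - 1) for n in range(N)]

def erf_edge(a, b, c):
    """Edge-like function psi(x, y) = Erf(a*x + b*y + c)."""
    return lambda x, y: math.erf(a * x + b * y + c)

def parametric_part(x, y, h, betas, psis):
    """h0 + h1*x + h2*y + h3*x*y + sum_k beta_k psi_k(x, y)."""
    poly = h[0] + h[1] * x + h[2] * y + h[3] * x * y
    return poly + sum(b * psi(x, y) for b, psi in zip(betas, psis))

N = 5                                  # odd window size (hypothetical)
xs = grid(N)
psis = [erf_edge(20.0, 0.0, -10.0)]    # steep vertical edge at x = 0.5 (made-up a, b, c)
h = [100.0, 0.0, 0.0, 0.0]             # flat background level
betas = [40.0]                         # half of the edge height

# patch[row][col] is the model value at (x = xs[col], y = xs[row]).
patch = [[parametric_part(x, y, h, betas, psis) for x in xs] for y in xs]
```

With these values the patch jumps from about 60 on the left to about 140 on the right, passing through 100 at the center column: a sharp edge is described by a single coefficient instead of many kernel terms.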



Taking a closer look at the term involving $\|f\|^2_{\mathcal{H}}$, according to equation (2), one sees that we actually penalize the derivatives of $f$ more strongly than the total variation scheme, which is often used in wavelet-based denoising and penalizes only the first-order derivatives. It turns out that in our method the use of the $L_1$ norm in the cost function, in combination with regularization, results in sparse modeling with respect to the $\beta$ coefficients. It should be noted that using the $L_1$ norm in the regularization term as well leads to similar results.
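Why an absolute-value cost copes well with impulse-like outliers can be seen on a one-parameter toy problem: fitting a single constant to samples that contain one outlier. The sketch below uses a plain subgradient iteration with a diminishing step size; it is not the projected subgradient solver used by the paper, and the data, the step rule and the function names are assumptions made for the example.

```python
def l1_objective(c, samples, mu):
    """Sum of absolute residuals plus a small quadratic regularizer."""
    return sum(abs(c - z) for z in samples) + 0.5 * mu * c * c

def l1_subgradient(c, samples, mu):
    """One valid subgradient of l1_objective at c (sign(0) taken as 0)."""
    sign = lambda r: (r > 0) - (r < 0)
    return sum(sign(c - z) for z in samples) + mu * c

def subgradient_descent(samples, mu=0.01, iters=5000, step0=0.5):
    """Diminishing-step subgradient iteration; returns the best iterate seen."""
    c, best_c = 0.0, 0.0
    best_val = l1_objective(c, samples, mu)
    for k in range(1, iters + 1):
        c -= (step0 / k) * l1_subgradient(c, samples, mu)
        val = l1_objective(c, samples, mu)
        if val < best_val:          # subgradient steps are not monotone
            best_c, best_val = c, val
    return best_c

samples = [1.0, 1.1, 0.9, 1.05, 50.0]    # last value plays the role of an impulse
c_l1 = subgradient_descent(samples)       # absolute-value fit
c_l2 = sum(samples) / len(samples)        # least-squares fit (the mean)
```

The absolute-value fit settles near the median of the samples (1.05), while the least-squares fit is dragged to about 10.8 by the single outlier.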

Table 1. Results on Boat corrupted by impulse noise.

Image  Noise  Noisy PSNR  Kernel Denoising  BiShrink [2]  K-SVD [4]  SKR L1 [10]  SKR [10]  BM3D [3]
Boat   20%    18.56 dB    32.36 dB          22.59 dB      26.46 dB   31.85 dB     28.35 dB  29.45 dB
Boat   30%    16.77 dB    30.66 dB          25.07 dB      26.79 dB   30.85 dB     27.05 dB  28.29 dB
Boat   40%    15.52 dB    29.14 dB          25.40 dB      26.08 dB   29.51 dB     25.85 dB  27.26 dB
Boat   50%    14.55 dB    28.10 dB          25.09 dB      25.38 dB   27.73 dB     24.90 dB  26.61 dB

Table 2. Results on Lena corrupted by Gaussian noise.

Image  Noise   Noisy PSNR  Kernel Denoising  BiShrink [2]  BLS-GSM [8]  K-SVD [4]  SKR L1 [10]  SKR [10]  BM3D [3]
Lena   s = 10  28.12 dB    33.98 dB          34.33 dB      35.60 dB     35.47 dB   32.66 dB     35.32 dB  35.93 dB
Lena   s = 20  22.14 dB    31.12 dB          31.17 dB      32.65 dB     32.36 dB   29.23 dB     32.62 dB  33.00 dB
Lena   s = 30  18.72 dB    29.11 dB          29.35 dB      30.50 dB     30.30 dB   26.60 dB     30.71 dB  31.21 dB

The semi-parametric Theorem 2.1 ensures that the minimizer will have a finite representation of the form:



$$\tilde{f}(x, y) = \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} \alpha_{n,m} \, \kappa\big((x_n, y_m), (x, y)\big) + \sum_{k=1}^{K} \beta_k \psi_k(x, y) + h_0 + h_1 x + h_2 y + h_3 x y.$$

We can solve this problem using Polyak's Projected Subgradient Method [?]. We fix the regularization parameter $\lambda$ and adjust $\mu$ and $\mu_1$ so that they take small values around edges and large values in smooth areas. In particular, as the algorithm moves from one pixel to the next, it decides whether the corresponding pixel-centered region contains edges or not, using the mean gradient of the specific region, and then solves the corresponding minimization problem.

4 Experimental Results

Figure 2 and Tables 1 and 2 show the results obtained using our algorithm on the Lena and Boat (512 x 512) grayscale images. More experimental results, the code in C (for the proposed methodology), as well as details on the implementation may be found at http://cgi.di.uoa.gr/~stheodor/ker_den/. The results were compared with those obtained using several state-of-the-art wavelet-based denoising packages, which are available on the internet ([3], [10], [4], [8], [2]). The experiments show that the kernel approach performs comparably to the well-known BiShrink wavelet-based method [2] in the presence of Gaussian noise. However, it significantly outperforms the other denoising methods when impulse noise or mixed noise is considered (see Figure 2). This enhanced performance is obtained at the cost of higher complexity, which is mainly due to the optimization step, which is of the order of $O(MN)$ per pixel. More efficient optimization algorithms are currently being considered. Moreover, the whole setting lends itself to straightforward parallelization when a parallel processing environment is available. This is also currently under consideration.

5 Conclusions

A novel denoising algorithm was presented, based on the theory of RKHS. The semi-parametric Representer Theorem was exploited in order to cope with the problems associated with smoothing around edges, which is a common problem in almost all denoising algorithms. The comparative study against other denoising techniques showed that significantly enhanced results are obtained in the case of impulse noise and mixed noise.
Figure 2. (a) Original image, (b) original with additive Gaussian noise, PSNR = 22.14 dB, (c) wavelet BiShrink denoising [2], PSNR = 31.17 dB, (d) kernelized denoising, PSNR = 31.12 dB, (e) original image, (f) original with additive impulse noise, PSNR = 15.52 dB, (g) BM3D denoising [3], PSNR = 27.26 dB, (h) kernelized denoising, PSNR = 29.14 dB.
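For reference, PSNR values of the kind quoted above are commonly computed from the mean squared error between a reference image and a test image. The sketch below assumes the usual definition for 8-bit images, $10 \log_{10}(255^2 / \mathrm{MSE})$; the paper does not spell out its exact formula, and the function name `psnr` and the tiny sample images are illustrative.

```python
import math

def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized images given as lists of rows."""
    sq_err, count = 0.0, 0
    for row_r, row_t in zip(reference, test):
        for r, t in zip(row_r, row_t):
            sq_err += (r - t) ** 2
            count += 1
    mse = sq_err / count
    if mse == 0:
        return math.inf             # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 2 x 2 images: every pixel is off by 10, so MSE = 100.
reference = [[100, 100], [100, 100]]
noisy = [[110, 90], [90, 110]]
value = psnr(reference, noisy)
```

An MSE of 100 corresponds to zero-mean noise with standard deviation 10 and gives roughly 28.13 dB, which is close to the 28.12 dB noisy PSNR listed for s = 10 in Table 2, if s there denotes the noise standard deviation.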



